Release 1.0.2-beta #109

Merged on Nov 20, 2024 (20 commits)

Conversation

dmepham
Collaborator

@dmepham dmepham commented Nov 20, 2024

Description

Adds the changes needed to use the CloudZero kube-state-metrics (KSM) subchart. The agent now uses cloudzero-ksm by default.

References

See the initial PR for reference

Testing

Tested in the previous PRs; additional testing is in progress.

  • This change adds test coverage for new/changed/fixed functionality

Checklist

  • I have added documentation for new/changed functionality in this PR
  • All active GitHub checks for tests, formatting, and security are passing
  • The correct base branch is being used, if not main

dmepham and others added 13 commits November 18, 2024 12:06
… subchart (#91)

* override KSM name

* enable ksm by default

* make cloudzero ksm undiscoverable

* improve documentation

* option 2 is not the default behavior

* fix indentation

* add line

* add documentation for changing the service port for cloudzero ksm

* disable cloudzero KSM as scrape target

* set default port

* fix endpoint

* use default port

* add release notes

* remove metric exporter documentation

* change beta version
* change kube-state-metrics value name to avoid template errors

* define static target

* fix kube-state-metrics dependency

* remove unused documentation

* cast port to int

* fix endpoint

* update scrape config

* dynamically populate metrics

* use camel case
@dmepham dmepham requested a review from a team as a code owner November 20, 2024 21:34
Contributor

@wreckedred wreckedred left a comment

Thank you - lgtm

@dmepham dmepham deployed to release-notes November 20, 2024 22:39 — with GitHub Actions Active
Contributor

@beckilee beckilee left a comment

:shipit:

@dmepham dmepham merged commit 6d91d10 into feature/1.0.2-beta-release Nov 20, 2024
3 checks passed
@dmepham dmepham deleted the CP-2351-labels-beta-ksm branch November 20, 2024 22:46
dmepham added a commit that referenced this pull request Nov 21, 2024
* CP-23051: Change default kube-state-metrics behavior to use Cloudzero subchart (#91)

* override KSM name

* enable ksm by default

* CP-23388: Define Static KubeStateMetrics Target Endpoint (#99)

* add 1.0.2 release doc file

---------

Co-authored-by: bdrennz <[email protected]>
@josephbarnett josephbarnett mentioned this pull request Jan 9, 2025
3 tasks
dmepham added a commit that referenced this pull request Jan 28, 2025
* CP-22731: add insights-controller chart (#97)

* CP-22731: include cz-insights-controller as subchart

* increase replicacount for tag server

* CP-22731: add beta testing

* update release process for insights controller

* update release workflow

* make most resources off by default

* update readme

* use global for secret names

* incorporate changes from 0.0.30-beta

* add beta release doc

* use local chart for testing

---------

Co-authored-by: josephbarnett <[email protected]>

* CP-22730: use correct pattern list in config

* CP-22730: update doc check location to match normal release path (#100)

* Update Chart.yaml to version 1.0.0-beta

* use latest insights-controller

* CP-23435: remove duplicate service account name in insights-controller chart (#103)

* CP-23426: use insights-controller service account for init job (#104)

* CP-23465: increase default replica count for insights controller (#106)

* CP-23423: add release doc for 1.0.1-beta release (#107)

* [CP-23425] add default remote write retries (#108)

* CP-23425: set default max retries

* update init job to work with long running scrapes

* increase wait time for scrape endpoint

* default batch size added

* increase wait time for init job

* adjust remote write threshold, add default resource values

* Release 1.0.2-beta (#109)

* CP-23051: Change default kube-state-metrics behavior to use Cloudzero subchart (#91)

* override KSM name

* enable ksm by default

* CP-23388: Define Static KubeStateMetrics Target Endpoint (#99)

* add 1.0.2 release doc file

---------

Co-authored-by: bdrennz <[email protected]>

* move release doc to correct location

* Update Chart.yaml to version 1.0.2-beta

* CP-22730: package both charts in beta release (#110)

* CP-22730: fix artifact name (#111)

* CP-22730: fix doc path for github release publish (#112)

* CP-23740 (Feature/1.0.3 beta release): Validate KSM Metrics at Install (#116)

* remove unused metric

* add kubemetrics

* bump chart version for beta

* use dev tag for validator

* fix endpoint var name

* allow github to bump version

* simplify metric logic

* update tag

* use dev tag for chart

* [CP-23429] merge insights-controller into main chart (#117)

* insights-controller added to agent chart

* [CP-23428] add helm chart for creating cert (#118)

* CP-23428: add certificate helm chart

* update with documentation comments

* Update charts/cloudzero-agent/README.md

Co-authored-by: Becki Lee <[email protected]>

* Update charts/cloudzero-agent/README.md

remove duplicate entry

Co-authored-by: Becki Lee <[email protected]>

* Update charts/cloudzero-agent/README.md

add period to end of sentence in readme

Co-authored-by: Becki Lee <[email protected]>

* PR suggestion for readme

* update config example

---------

Co-authored-by: Becki Lee <[email protected]>

* CP 24028 add insights controller scrape config (#120)

CP-24028: add scrape target for insights container
CP-22734: Bump insights image release version
Enhance README for helm repo management
Add release note for next beta version
Update release process for customer version numbers in betas

* Update Chart.yaml to version 1.0.0-beta-4

* CP 23892 add healthcheck (#121)

* CP-23892, CP-24009, CP-23959: release note
* add healthcheck support
* bump value of insights controller

* Update Chart.yaml to version 1.0.0-beta-5

* fix beta deploy script

* CP-24118: affinity settings, release notes (#122)

* CP-24118: add pod best effort affinity rule for distributing pod instances across nodes
* allow override of KSM in configuration
* add next release notes
* bump version of controller and validator
* fix table in release note

* CP-23452 Add recommended installation skills to README (#124)

* CP-24008: forward insights controller app metrics (#125)

* CP-24389: deprecate unused chart (#126)

* CP-20221: Labels and Annotations (#127)

* bump final version of insights controller
* Adding release notes for 1.0.0 release
* Adding cert troubleshooting guide

---------

Co-authored-by: Becki Lee <[email protected]>

* publish material for beta-6 (#128)

* update readme, add extra svc names to cloudzero-cert, add cloudzero-cert chart publish (#129)

* [CP-24464] default to create self-signed cert upon chart install (#130)

* default to create self-signed cert upon chart install

* Update charts/cloudzero-agent/docs/releases/1.0.0-beta-7.md

Co-authored-by: Becki Lee <[email protected]>

* Update charts/cloudzero-agent/docs/releases/1.0.0-beta-7.md

Co-authored-by: Becki Lee <[email protected]>

* Update charts/cloudzero-agent/docs/releases/1.0.0-beta-7.md

Co-authored-by: Becki Lee <[email protected]>

* Update charts/cloudzero-agent/README.md

Co-authored-by: Becki Lee <[email protected]>

* Update charts/cloudzero-agent/README.md

Co-authored-by: Becki Lee <[email protected]>

* Update charts/cloudzero-agent/README.md

Co-authored-by: JB <[email protected]>

* Update charts/cloudzero-agent/README.md

Co-authored-by: JB <[email protected]>

---------

Co-authored-by: Becki Lee <[email protected]>
Co-authored-by: JB <[email protected]>

* enable new metric for insights controller failures (#132)

* CP-24424: change init scrape job to use new -backfill option (#131)

Previously, the scrape job would use curl to hit a /scrape HTTP
endpoint on the webhook server. This was problematic on larger clusters
where the operation takes a long time since the HTTP context was
getting cancelled before the operation completed.

This patch switches to using a new -backfill option on the controller
binary, which causes the binary to run the backfiller (née scraper) and
exit instead of acting as an HTTPd.
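
The run-once-and-exit pattern described above can be sketched roughly as follows. This is a Python illustration only, not the controller's actual code: `choose_mode`, `run_backfill`, and the program name are hypothetical, and only the `-backfill` flag name comes from the commit message.

```python
import argparse


def run_backfill(resources):
    """Process every existing resource once and return how many were handled.

    Hypothetical stand-in for the backfiller: it runs to completion in the
    foreground, so no HTTP request context can cancel it part-way through.
    """
    handled = 0
    for _ in resources:
        handled += 1
    return handled


def choose_mode(argv):
    """Return "backfill" when -backfill is passed, otherwise "serve"."""
    parser = argparse.ArgumentParser(prog="controller-sketch")
    parser.add_argument(
        "-backfill",
        action="store_true",
        help="run the backfiller once and exit instead of serving HTTP",
    )
    args = parser.parse_args(argv)
    return "backfill" if args.backfill else "serve"
```

With this shape, an init job would invoke the binary with `-backfill` and wait for it to exit, rather than holding an HTTP request open for the duration of the operation.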

* remove certificate chart from beta workflow (#133)

* Update Chart.yaml to version 1.0.0-beta-7

* add back missing packaging (#134)

* add upgrade command to beta-7 release notes (#135)

* CP-24743: allow all resources to use imagePullSecrets (#136)

* CP-24743: add imagePullSecrets to cert job

* Update Chart.yaml to version 1.0.0-beta-8

* CP-24792: allow more configurable settings, increase default remote write timeout (#137)

* CP-24792: allow more configurable settings, increase default remote write timeout

* CP-24792: add KSM image info for easy identification of images to mirror for private image registries (#139)

* CP-24792: add KSM image info for easy identification of images to mirror to private repos

* add template command for finding images

---------

Co-authored-by: Becki Lee <[email protected]>

* CP-24833: template KSM service address using the release name (#140)

* Update Chart.yaml to version 1.0.0-beta-9

* CP-24886: ensure KSM service and KSM target always match (#143)

* CP-24886: ensure ksm svc and target match

* Update NOTES.txt

---------

Co-authored-by: Thomas Evans <[email protected]>

* Update Chart.yaml to version 1.0.0-beta-10

* Add server.agentMode boolean configuration option

This just provides a convenient way to toggle agent mode on/off for
debugging, which is valuable since agent mode disables a *lot* of
Prometheus functionality which can be very useful for debugging, such
as the /graph endpoint.

* Add metric_relabel_configs to insights controller scrape job.

This should just restrict the metrics to those we're interested in,
as defined in values.yaml.
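
For readers unfamiliar with the effect of a `keep` rule on `__name__`, here is a minimal Python sketch of the filtering behavior. It assumes an allow-list of metric names; the names used below are illustrative and are not taken from this chart's values.yaml. Prometheus anchors relabel regexes at both ends, which the sketch mirrors with `fullmatch`.

```python
import re


def keep_metrics(samples, allowed_names):
    """Keep only samples whose __name__ exactly matches an allowed name.

    Mirrors a relabel rule with action "keep" on source label __name__:
    the regex is anchored at both ends, so partial matches are dropped.
    """
    pattern = re.compile("|".join(re.escape(name) for name in allowed_names))
    return [s for s in samples if pattern.fullmatch(s["__name__"])]


# Illustrative input; the metric names are assumptions for the example.
scraped = [
    {"__name__": "kube_node_info", "node": "a"},
    {"__name__": "kube_node_info_extra", "node": "a"},
    {"__name__": "go_goroutines"},
]
kept = keep_metrics(scraped, ["kube_node_info", "kube_pod_info"])
```

Here `kept` contains only the `kube_node_info` sample; the other two series are dropped before ingestion.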

* CP-23129: add Prometheus scrape job to scrape metrics from itself

I also switched from a hardcoded value to using
`prometheusConfig.scrapeJobs.kubeStateMetrics.scrapeInterval` for the
KSM job scrape_interval. This seems to pretty clearly be the intent
of the configuration option, but it was not being used. Notably, this
increases the interval from 1m to 2m.
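
To put the interval change in concrete terms, the following Python sketch computes scrapes per hour for a given interval. `parse_interval` and `scrapes_per_hour` are hypothetical helpers, and the parser handles only a single-unit subset (`s`, `m`, `h`) of the duration syntax Prometheus accepts.

```python
import re

# Seconds per unit for the subset of duration units handled here.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}


def parse_interval(text: str) -> int:
    """Convert a simple duration such as "30s" or "2m" into seconds."""
    match = re.fullmatch(r"(\d+)([smh])", text)
    if match is None:
        raise ValueError(f"unsupported interval: {text!r}")
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]


def scrapes_per_hour(interval: str) -> int:
    """Number of scrapes of one target per hour at the given interval."""
    return 3600 // parse_interval(interval)
```

Moving from `1m` to `2m` takes the KSM job from 60 scrapes per hour to 30, halving the number of samples collected per series.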

* [CP-24912] use image tag and chart name in init job name (#144)

* always use insightsController image reference in init scrape job name

* CP-24655: use backfill instead of scrape for init job that gathers existing state (#145)

* CP-25115: add release notes for 1.0.0-rc1 release (#147)

* CP-24655: add release notes for RC1

* fix main chart release in rel branch (#151)

* CP-25165: allow user to choose release branch in main chart release (#152)

* CP-25165: checkout given branch (#153)

* CP-25165: checkout given branch in correct order (#154)

* CP-25165: checkout the input branch, not main (#155)

* Basic install success message. (#149)

* Update charts/cloudzero-agent/Chart.yaml

Co-authored-by: JB <[email protected]>

* CP-25270: prepare release/1.0.0 for merging (#158)

* update docs, remove cert-manager references from test, add missing quote

---------

Co-authored-by: josephbarnett <[email protected]>
Co-authored-by: Automated CZ Release <[email protected]>
Co-authored-by: bdrennz <[email protected]>
Co-authored-by: Becki Lee <[email protected]>
Co-authored-by: JB <[email protected]>
Co-authored-by: evan-cz <[email protected]>
Co-authored-by: Thomas Evans <[email protected]>
5 participants